ANNUAL EXTREME LAKE ELEVATIONS BY TOTAL
PROBABILITY THEOREM


Harold  E.  Kubik,  P.E.* 


ABSTRACT: Annual extreme water levels on the Great Lakes, whether maximums or minimums,
have a high serial dependence. Therefore, the results of traditional frequency analysis
techniques must be interpreted in a different manner, and more sophisticated statistical
techniques must be applied to account for this dependence.

The terms "Percent Chance Exceedance" and "Return Period" are applied to the expectation
values of annual extreme events that are random in nature and have an equal likelihood of
occurring in any given year. Annual extreme lake elevations on the Great Lakes are not random
from one year to the next; therefore, the usual terms to define the expectation should not be
used to describe the events. An acceptable term is "Percent of Years Exceeded." This is
comparable to the label "Percent of Time Exceeded" that is applied to flow- or
elevation-duration curves.

Decomposition of the annual extremes into two parts, one containing the highly dependent
part and the other containing the random part, is one method of dealing with the dependence
in the lake elevations. Appropriate statistical analyses can be applied to the separate parts and
then the individual results combined to obtain the final frequency relation. This study develops
mean monthly lake elevation duration curves to represent the dependent part and wind setup
frequency curves for the random part. These parts are then combined by application of the total
probability theorem.

Seasonality of the occurrence of both parts was found to be very important. Therefore, the
complete analysis was done for the six-month fall-winter period and the six-month
spring-summer period. The two curves were combined by the union of probabilities.

This technique does not gain any information over a smooth curve drawn through the
observed events when applied to long-record gauges like Cleveland and Buffalo harbor. The
technique is most useful in application to short-record stations. The long record of monthly lake
elevations for a particular lake provides the information for the highly dependent part. The
wind setup information for a short-record gauge may be correlated with a nearby long-record
gauge to be made more indicative of a longer record.

Application of this method to the Buffalo harbor and Cleveland gauges resulted in
computed "1% of Years Exceeded" elevations of 579.79 feet (176.72 meters) and 574.72 feet
(175.17 meters) (IGLD 1955), respectively.


Introduction 

The Great Lakes are an important natural resource that has attracted a variety of human
activities, including waterborne commerce, water supply, hydroelectric power, recreation, and
habitation, to mention some of the more important ones. The wise management of the lakes and
the land adjacent to these bodies of water requires some anticipation of the likely lake levels.
The establishment of non-building zones, for instance, relies on an estimate of the likely
maximum water levels. Planners and designers involved in the location of boat harbors and the
depth of navigation channels need information on the expected minimum water levels. The
computation of these likely levels is complicated by the long-term fluctuations of the Great
Lakes' water levels.

The normal procedure of establishing zones that are subject to flooding, especially in riverine
conditions, is to compute a frequency curve based on the available flood data. One of the
requirements for a frequency analysis is that the events are random, independent events. The
Great Lakes' water level data do not meet this requirement. The annual extreme values are highly
correlated from year to year because of the strong dependence on the mean level during the year.
Therefore, normal frequency analysis procedures cannot be applied to these data. It is possible
to use statistical analysis techniques to analyze the extremes by separating each event into two
components: the first, the long-time scale, highly dependent fluctuation represented by mean
lake elevations; and the second, the short-time scale, very independent fluctuations generally
caused by wind stress on the lake. These components, after individual analysis, can be
recombined to provide an indication of the percent of annual instantaneous maximum events that
will exceed a given elevation. Application of these techniques to the annual minimums would
provide the percent of annual events that do not exceed (nonexceedance) a given elevation.


Data  Available  for  Analysis 

Very long records, by usual hydrologic standards in the U.S., of mean monthly water levels on
Lake Erie have been observed at the Cleveland and Buffalo harbor gauges. The Cleveland record is
continuous since January 1860 (129 years through 1988). Although some mean monthly values were
recorded for the 1860-1869 period, the continuous record at Buffalo harbor began in March 1887
(nearly 102 years through 1988). A continuous record of annual instantaneous extremes is
available for the period 1900-1988 at Buffalo harbor and for the period 1904-1988 at Cleveland.
Figure 1 is a plot of mean annual lake elevations at Cleveland. One could conclude from this plot
that the 129 years of information is really a very short period. The water levels in the 1860s
began fairly high and gradually moved downward until the dramatic decrease in the early 1930s to
a low in 1934. After this lowest annual level, the levels generally increased to the high
experienced in 1986. Fitting the mean annual elevations with a smooth curve makes it appear that
only one-half of a cycle has been observed. The high persistence has effectively reduced our
knowledge of how often to expect extreme high or low water levels.

Figure 1. Mean annual elevations on Lake Erie, Cleveland gauge.


Annual  Persistence 

Computation of the serial correlation coefficient for the annual extremes, a measure of how well
one year is related to the next year, provides a quantitative evaluation of persistence. The lag 1
correlations for the annual maximum events are 0.752 and 0.406 for Cleveland and Buffalo harbor,
respectively. The strength of this persistence becomes clearer when it is noted that lags 1
through 4 (this year is related to 4 years previous) are found to be significant.
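
As an illustration of the computation, the following minimal Python sketch evaluates a lag-k
serial correlation coefficient. The estimator shown is a common one and is an assumption here,
since the exact form used in this study is not stated; the function name is hypothetical.

```python
from statistics import fmean


def serial_correlation(series, lag=1):
    """Lag-k serial correlation coefficient of a sequence of annual values.

    Assumes the common estimator that uses the overall mean and the full
    sum of squares in the denominator. The series must not be constant.
    """
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = fmean(series)
    # Cross products between each year and the year `lag` years later.
    numerator = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    denominator = sum((value - mean) ** 2 for value in series)
    return numerator / denominator
```

A series that drifts slowly, as the annual maximums do, gives a positive coefficient; a series
that alternates from year to year gives a negative one.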

Comparison of the time series plots of the annual instantaneous extremes, Figures 2 and 3, with
the mean annual values illustrates that the extremes have the same pattern as the mean annual
values.

As the general lake levels are a large component of the annual extreme, removal of this
component could result in values that do meet the frequency requirement of being random and
independent. This separation was accomplished by noting the month of the extreme, and
subtracting the mean monthly water level at the gauge from the instantaneous extreme. This
provided a change in elevation value that is termed "wind setup." (Note that wind setup is
negative for the annual instantaneous minimums.) Serial correlation computations indicate that
the wind setup values are random events; therefore, frequency analysis techniques can be applied
to these data. This provides one component of the annual extreme values.
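
The separation step can be sketched as follows. This is a minimal illustration only; the record
layout (year, month, elevation), the function name, and any values used with it are hypothetical
and are not taken from the gauge records.

```python
def wind_setups(extremes, monthly_means):
    """Return the wind setup for each annual instantaneous extreme.

    extremes: iterable of (year, month, elevation) for the annual extreme
        at a gauge (hypothetical layout).
    monthly_means: dict mapping (year, month) to the mean monthly water
        level at the same gauge, in the same units and datum.
    """
    return {
        # Extreme minus the mean level of the month in which it occurred.
        year: elevation - monthly_means[(year, month)]
        for year, month, elevation in extremes
    }
```

Applied to annual minimums, the same subtraction yields negative values, consistent with the
note above.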
Figure 2. Annual instantaneous maximums at Buffalo harbor and Cleveland.

Figure 3. Annual instantaneous minimums at Buffalo harbor and Cleveland.


A second component is the long-term lake fluctuations. This component is represented by a mean
monthly elevation duration curve. These values are highly correlated, so the frequency label would
be "Percent of Time Exceeded" to imply that they are not independent events.

Seasonality  of  Extremes 

It  became  apparent  as  this  study  progressed  that  seasonality  was  important  in  the  analysis  of  the 
extreme  events.  The  Buffalo  harbor  and  Cleveland  maximum  levels  occur  at  entirely  different  times  of 
the  year.  The  Buffalo  harbor  maximums  occur  in  the  fall-winter  months,  indicating  a  response  to  the 
winter  storms  because  the  monthly  lake  levels  are  usually  lower  during  the  winter  months.  At 
Cleveland,  the  maximums  occur  in  the  spring-summer  months  indicating  that  the  seasonal  high  mean 
lake levels are the larger determining factor. This is illustrated in Figure 4 for the maximum and
minimum values at Buffalo harbor and in Figure 5 for Cleveland. For this study, the data were
divided into two 6-month seasons. The fall-winter season included the months of October,
November, December, January, February, and March. The spring-summer season included the months
of April, May, June, July, August, and September.
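
The seasonal split is simple to apply when sorting the extremes. A small sketch, with a
hypothetical function name and months numbered 1 through 12:

```python
# Months of the fall-winter season as defined in this study.
FALL_WINTER_MONTHS = {10, 11, 12, 1, 2, 3}


def season(month):
    """Return the 6-month season for a calendar month number (1-12)."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return "fall-winter" if month in FALL_WINTER_MONTHS else "spring-summer"
```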

The  minimum  levels  are  more  influenced  by  the  mean  monthly  lake  levels,  although  the  effect  of 
wind  related  minimums  can  be  noted  at  the  Buffalo  harbor  gauge  for  March  and  April  (February  has 
the  lowest  average  monthly  elevation  at  both  gauges). 


Figure 4. Months of annual maximums and minimums, Buffalo harbor.

Figure 5. Months of annual maximums and minimums, Cleveland.


Total Probability Method

Now  that  the  annual  extremes  have  been  decomposed  into  two  components  for  each  of  the  seasons, 
some  method  must  be  applied  to  put  the  data  back  together  again.  This  can  be  done  by  applying  the 
total  probability  theorem.  The  total  probability  theorem,  as  presented  in  most  statistics  texts  (Benjamin 
and  Cornell  1970)  is: 


$$P[A] = \sum_{i=1}^{n} P[A \mid B_i] \, P[B_i]$$

where:

P[A | B_i] is the conditional probability of the event A given that event B_i has occurred, and

B_i (i = 1, ..., n) is a set of mutually exclusive, collectively exhaustive events of size n.

The conditional probability relations are derived by selecting a given lake elevation and then
adding this value to the wind setup frequency curve. This gives a single conditional frequency
curve that has a certain probability of occurring. Many of these conditional frequency curves can
be computed to completely define the range of water level occurrences. Figure 6 shows seven such
conditional frequency curves. Each curve is labeled with the mean monthly lake elevation used to
derive the curve and the percent of time that this elevation is exceeded. The horizontal axis
(Percent of Years Exceeded) is the P[A | B] portion of the total probability equation. The P[B]
portion of the equation is the amount of probability (percent of time) represented by each curve.
This can, simplistically, be computed by adding one-half of the difference between the curve and
each of its two adjacent curves. For example, the probability associated with the curve based on a
monthly elevation of 571.06 (exceeded 50% of the time) would be
[(70%-50%)/2 + (50%-30%)/2]/100 = 0.20 units of probability. Doing this for all the curves will
yield a set of values that add up to 1.0. In other words, all the possible mean monthly elevations
have been considered by discrete increments of probability.
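
The duration label on each curve and its increment of probability can be sketched in Python as
follows. This is an illustration under stated assumptions: the function names are hypothetical,
the seven percent-of-time labels are invented for the example (only the 70%, 50%, and 30% values
appear in the text), and extending the two outermost bands to 0% and 100% is an assumption made
so that the increments sum to 1.0.

```python
def percent_of_time_exceeded(monthly_levels, elevation):
    """Percent of months whose mean level is above `elevation`."""
    above = sum(1 for level in monthly_levels if level > elevation)
    return 100.0 * above / len(monthly_levels)


def curve_weights(percent_exceeded):
    """P[B] for each conditional curve, from its percent-of-time label.

    Each curve represents the band between the midpoints to its two
    neighbours. The outer bands are extended to 100% and 0% (assumption).
    Weights are returned in order of decreasing percent exceeded.
    """
    p = sorted(percent_exceeded, reverse=True)
    bounds = [100.0] + [(a + b) / 2.0 for a, b in zip(p, p[1:])] + [0.0]
    return [(upper - lower) / 100.0 for upper, lower in zip(bounds, bounds[1:])]


# Hypothetical labels for seven conditional curves (percent of time exceeded).
labels = [95, 90, 70, 50, 30, 10, 5]
weights = curve_weights(labels)
```

With these labels, the curve exceeded 50% of the time receives the same 0.20 units of
probability as in the hand computation above.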

The total probability equation is applied at each desired elevation to compute an expectation of
that elevation being exceeded. To derive a frequency relation, several elevations would be
selected covering the expected range of values. Figure 7 illustrates in a graphical way what the
equation is doing. An elevation of 574.0 was selected, then the Percent of Years Exceeded for each
curve is noted and plotted on Figure 7 against the Percent of Time (converted to probability by
dividing by 100). After all of the intercepts have been plotted, a smooth curve is drawn through
the points. (Note that not all of the curves used to develop Figure 7 are shown on Figure 6.) For
an elevation of 574.0, the expected Percent of Years Exceeded of 4.37% is the probability weighted
average, or the area under this curve.

This  computational  procedure  is  often  called  coincident  frequency  analysis  in  Corps  of  Engineers 
publications.  As  these  computations  are  laborious,  a  computer  program  has  been  written  (HEC 1989) 
that  accepts  as  input  the  mean  monthly  elevation-duration  relation  and  the  wind  setup  frequency 
relation.  The  program  then  generates  the  requisite  conditional  curves  and  evaluates  the  total  probability 
theorem  for  several  elevations  to  provide  an  elevation  expectation  relation. 
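
The bookkeeping described above can be sketched in a few lines of Python. This is an illustrative
sketch only and is not the HEC program cited above. The function names, the tabulated form of the
wind setup curve, the index elevations, and the linear interpolation between tabulated points are
all assumptions made for the example; frequency curves are often interpolated on a probability
scale instead.

```python
from bisect import bisect_right


def setup_exceedance(setup_curve, setup):
    """Percent of years in which the wind setup exceeds `setup`.

    setup_curve: list of (setup, percent_exceeded) pairs sorted by
        increasing setup (hypothetical tabulation). Values outside the
        table are clamped to the end points (assumption).
    """
    xs = [x for x, _ in setup_curve]
    if setup <= xs[0]:
        return setup_curve[0][1]
    if setup >= xs[-1]:
        return setup_curve[-1][1]
    j = bisect_right(xs, setup)
    (x0, p0), (x1, p1) = setup_curve[j - 1], setup_curve[j]
    return p0 + (p1 - p0) * (setup - x0) / (x1 - x0)


def percent_of_years_exceeded(target, index_elevations, weights, setup_curve):
    """Total probability: sum of P[A | B_i] * P[B_i] over the index levels.

    For each mean monthly elevation B_i, the target elevation is exceeded
    when the wind setup exceeds (target - elevation).
    """
    return sum(
        weight * setup_exceedance(setup_curve, target - elevation)
        for elevation, weight in zip(index_elevations, weights)
    )
```

Evaluating `percent_of_years_exceeded` over a range of target elevations traces out the
elevation expectation relation for one season.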


Figure 6. Conditional frequency curves, Cleveland, spring-summer season. (Elevation axis in
feet, IGLD 1955.)

Figure 7. Graphical representation of applying the total probability equation.


Results 

The final results were found by combining the computed "frequency curves" for each of the
seasons. This is done by the union of probabilities. This equation is:

$$P_c = 100 \left[ 1 - \left(1 - P_1/100\right) \left(1 - P_2/100\right) \right]$$

where: P_c = the combined frequency value in percent for the selected elevation,

P_1 = the frequency value in percent for season 1 for the selected elevation, and

P_2 = the frequency value in percent for season 2 for the selected elevation.
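
A direct transcription of the union of probabilities into Python, offered as a sketch with a
hypothetical function name. The product form treats the two seasons as independent of each other.

```python
def combine_seasons(p1, p2):
    """Union of two seasonal exceedance values, all in percent."""
    for value in (p1, p2):
        if not 0.0 <= value <= 100.0:
            raise ValueError("percent values must lie in 0..100")
    # The elevation is not exceeded in the year only if it is not
    # exceeded in either season.
    return 100.0 * (1.0 - (1.0 - p1 / 100.0) * (1.0 - p2 / 100.0))
```

When one season contributes almost nothing at a given elevation, the combined value is
essentially that of the other season.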

Lake elevation expectation curves were computed for Buffalo harbor and Cleveland by the procedure
described herein. The monthly duration curves were based on the period 1860-1988 while the wind
setup curves were based generally on the 1900-1988 period. Therefore, these curves should be
fairly representative of the 1860-1988 period. The observed instantaneous annual maximums have
been assigned plotting positions and plotted along with the derived curves on Figures 8 and 9. The
"1% of Years Exceeded" elevations computed by this procedure were 579.79 feet (176.72 meters) and
574.72 feet (175.17 meters) (IGLD 1955) for Buffalo harbor and Cleveland, respectively.

The utility of this procedure is in the application to gauges that have fairly short records.
Mean monthly elevation duration relations based on a fairly long period are available for each of
the Great Lakes. The wind setup frequency relation for an individual station may be used, or the
relation could be adjusted by the "two-station comparison" procedures (Interagency Committee
1982) recommended for flood flow frequency computations. Application of these procedures to a
station with a fairly short record should provide elevation expectation curves that are
representative of a much longer period than the period of recorded maximum or minimum
instantaneous lake elevations.